The (Statistical) Power of Mechanical Turk

Author

  • Amelia Kimball
Abstract

In this paper, I argue for the use of Amazon Mechanical Turk (AMT) in language research. AMT is an online marketplace of paid workers who can serve as experimental subjects, allowing researchers to increase the statistical power of their studies quickly and with minimal funding. I will show that, despite some obvious limitations of using remote subjects, properly designed experiments completed on AMT are trustworthy, cheap, and much faster than traditional face-to-face data collection. In addition, AMT workers can help with data analysis, which greatly expands the scope of research that a single researcher can carry out. This paper first presents several reasons for using online subjects, then briefly outlines how to build a survey-type experiment using AMT, and finally reviews several best practices for ensuring reliable data.


Similar resources

Rating Computer-Generated Questions with Mechanical Turk

We use Amazon Mechanical Turk to rate computer-generated reading comprehension questions about Wikipedia articles. Such application-specific ratings can be used to train statistical rankers to improve systems’ final output, or to evaluate technologies that generate natural language. We discuss the question rating scheme we developed, assess the quality of the ratings that we gathered through Am...

Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and f...

Annotating Large Email Datasets for Named Entity Recognition with Mechanical Turk

Amazon's Mechanical Turk service has been successfully applied to many natural language processing tasks. However, the task of named entity recognition presents unique challenges. In a large annotation task involving over 20,000 emails, we demonstrate that a competitive bonus system and inter-annotator agreement can be used to improve the quality of named entity annotations from Mechanical ...

Active Learning and Crowd-Sourcing for Machine Translation

In recent years, corpus based approaches to machine translation have become predominant, with Statistical Machine Translation (SMT) being the most actively progressing area. Success of these approaches depends on the availability of parallel corpora. In this paper we propose Active Crowd Translation (ACT), a new paradigm where active learning and crowd-sourcing come together to enable automatic...

On Recovering the Structure of Affect

This paper presents one of the few human computation experiments geared towards uncovering the structure of affect. Using Mechanical Turk workers across two separate studies, we empirically verified some of the popular beliefs about the structure of affect, but also provide some new evidence. We replicate and reveal not only the statistical structure of dimensions of affect, but also the effect...


Publication date: 2014